David Zhang

By bridging bioinformatics and engineering, I translate genetic and transcriptomic data into software that delivers real-world impact. With experience across the full software development lifecycle, I design, build, and deploy tools to solve bioinformatic problems, from prototyping innovative solutions to implementing and maintaining robust, production-ready pipelines.

Work Experience

Senior bioinformatics engineer

CoSyne Therapeutics

London, UK (hybrid)

Present - 2024

  • Optimise and scale machine learning tools for single-cell Perturb-seq data comprising millions of cells. Apply these tools to generate actionable insights and inform strategic decisions around company direction.
  • Design and deploy a data pipeline to ingest, tidy and version-control data for the CoSyne knowledge graph. Automate the release of the graph to AWS using Terraform and CI/CD, improving the efficiency and traceability of data updates.
  • Build and maintain infrastructure tooling, including Docker images, Terraform modules, CI/CD workflows and cruft templates, to streamline bioinformatics analyses.

Senior bioinformatics software engineer

Congenica

Hinxton, UK (hybrid)

2024 - 2022

  • Developed scalable Nextflow pipelines to process solid tumour DNA-sequencing data, covering alignment, variant calling, driver mutation annotation, and therapy matching.
  • Built Python and R packages to improve the efficiency of clinical verification, reducing the time taken by 2 weeks per quarterly release.

Bioinformatician internship (2 months)

Verge Genomics

London, UK (remote)

2021

  • Created a reproducible aberrant splicing detection pipeline using Docker for drug target discovery in C9orf72 ALS patients.

Education

PhD, Bioinformatics

University College London

London, UK

2022 - 2017

  • Analysed bulk RNA-sequencing data with the aim of improving the diagnosis rate of rare disease patients. Focussed on detection of aberrant splicing events as a strategy to prioritise pathogenic variants.
  • Released R/Bioconductor packages that enable bioinformatics analyses and interpretation. Championed best practices for software development through teaching workshops and courses.

MSc, Neuroscience

University College London

London, UK

2016 - 2015

  • Grade: Merit (68%)

BSc, Biomedical science

University College London

London, UK

2015 - 2012

  • Grade: 2:1 (69%)

Open-source software

Web development

Present - 2022

  • Portfolio website: Showcases my favourite open-source contributions. Built with Django and deployed using PythonAnywhere.

Rust packages

2024

  • tuni: Unify transcript identifiers across different samples.

Python packages

2023 - 2021

  • autogroceries: Use Selenium to automate your grocery shop.
  • stravaboard: An extendable Streamlit dashboard for tracking Strava runs.

R packages

2022 - 2020

  • ggtranscript: Visualising transcript structure and annotation using ggplot2.
  • dasper: Detection of aberrant splicing events in RNA-sequencing data.

Selected Publications

A complete list of my publications is available via Google Scholar

Developmental Consequences of Defective ATG7-Mediated Autophagy in Humans

The New England Journal of Medicine

2021

  • Role: Analyst

Megadepth: efficient coverage quantification for BigWigs and BAMs

Bioinformatics

2021

  • Role: R package developer